Sketching Data Sets for Large-Scale Learning: Keeping only what you need

Abstract

Big data can be a blessing: with very large training data sets it becomes possible to perform complex learning tasks with unprecedented accuracy. Yet, this improved performance comes at the price of enormous computational challenges. Thus, one may wonder: Is it possible to leverage the information content of huge data sets while keeping computational resources under control? Can this also help solve some of the privacy issues raised by large-scale learning? This is the ambition of compressive learning, where the data set is massively compressed before learning. Here, a "sketch" is first constructed by computing carefully chosen nonlinear random features [e.g., random Fourier (RF) features] and averaging them over the whole data set. Parameters are then learned from the sketch, without access to the original data set. This article surveys the current state of the art in compressive learning, including the main concepts and algorithms, their connections with established signal processing methods, existing theoretical guarantees on both information preservation and privacy preservation, and important open problems. For an extended version of this article that contains additional references and more in-depth discussions on a variety of topics, see [1].
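The sketching step described in the abstract (compute random Fourier features, then average them over the data set) can be illustrated with a minimal NumPy example. This is only a sketch under stated assumptions, not the authors' implementation: the function names (draw_frequencies, sketch, merge_sketches), the Gaussian frequency distribution, and the sizes used are all hypothetical choices made for illustration. The learning step that recovers parameters from the sketch is not shown.

```python
import numpy as np

def draw_frequencies(dim, num_features, scale=1.0, seed=0):
    """Draw random frequency vectors (assumption: i.i.d. Gaussian entries)."""
    rng = np.random.default_rng(seed)
    return rng.normal(scale=scale, size=(num_features, dim))

def sketch(X, Omega):
    """Average the complex exponential features exp(i * Omega @ x) over all rows x of X."""
    projections = X @ Omega.T                      # shape (n_samples, num_features)
    return np.exp(1j * projections).mean(axis=0)   # shape (num_features,)

def merge_sketches(z_a, n_a, z_b, n_b):
    """Combine the sketches of two disjoint subsets into the sketch of their union."""
    return (n_a * z_a + n_b * z_b) / (n_a + n_b)

# Illustrative data: 10,000 points in 2 dimensions, compressed to 50 complex numbers.
rng = np.random.default_rng(1)
X = rng.normal(size=(10_000, 2))
Omega = draw_frequencies(dim=2, num_features=50)

z = sketch(X, Omega)

# Because the sketch is an average, it can be computed chunk by chunk and merged.
z_merged = merge_sketches(sketch(X[:4000], Omega), 4000,
                          sketch(X[4000:], Omega), 6000)
```

In this example the size of the sketch depends only on the number of random features, not on the number of data points, and the chunked computation agrees with the one-pass computation up to floating-point error.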

Similar resources

Sequential Learning with LS-SVM for Large-Scale Data Sets

We present a subspace-based variant of LS-SVMs (i.e. regularization networks) that sequentially processes the data and is hence especially suited for online learning tasks. The algorithm works by selecting from the data set a small subset of basis functions that is subsequently used to approximate the full kernel on arbitrary points. This subset is identified online from the data stream. We imp...

What you see is what you need.

We studied the role of attention and task demands for implicit change detection. Subjects engaged in an object sorting task performed in a virtual reality environment, where we changed the properties of an object while the subject was manipulating it. The task assures that subjects are looking at the changed object immediately before and after the change. Our results demonstrate that in this si...

Sketching Techniques for Large Scale NLP

In this paper, we address the challenges posed by large amounts of text data by exploiting the power of hashing in the context of streaming data. We explore sketch techniques, especially the Count-Min Sketch, which approximates the frequency of a word pair in the corpus without explicitly storing the word pairs themselves. We use the idea of a conservative update with the Count-Min Sketch to red...

Find What You Need, Understand What You Find

The developments of the fields of Human-Computer Interaction (HCI) and Information Retrieval (IR) have followed parallel streams with both achieving significant impact in the early part of the 21st century. The intersection of these two areas engages an active community of researchers who have influenced user interfaces for World Wide Web (WWW) sites and search engines (Marchionini, 2006). The...

Journal

Journal title: IEEE Signal Processing Magazine

Year: 2021

ISSN: 1053-5888, 1558-0792

DOI: https://doi.org/10.1109/msp.2021.3092574